Acoustic Factorisation

Author

  • M. J. F. Gales
Abstract

This paper describes a new technique for training a speech recognition system on inhomogeneous training data. The proposed technique, acoustic factorisation, attempts to explicitly model all the factors that affect the acoustic signal. Because every factor is modelled explicitly, the trained model set can be used more flexibly than in standard adaptive training schemes. Since an individual model is trained for each factor, it is possible to factor in only those factors that are appropriate to a particular target domain, for example the distribution over all training speakers. The target-domain-specific factors, for example the target acoustic environment, are simply estimated from limited target-specific data. The theory of this new approach is described for particular speaker and environment transforms. Initial experiments on a large vocabulary speech recognition task are presented.


Similar resources

An explicit independence constraint for factorised adaptation in speech recognition

Speech signals are usually affected by multiple acoustic factors, such as speaker characteristics and environment differences. Usually, the combined effect of these factors is modelled by a single transform. Acoustic factorisation splits the transform into several factor transforms, each modelling only one factor. This allows, for example, estimating a speaker transform in a noise condition and...


Speaker and Noise Factorisation for Robust Speech Recognition

Speech recognition systems need to operate in a wide range of conditions. Thus they should be robust to extrinsic variability caused by various acoustic factors, for example speaker differences, transmission channel and background noise. For many scenarios, multiple factors simultaneously impact the underlying “clean” speech signal. This paper examines techniques to handle both speaker and back...


Mutually exclusive grounding for weakly supervised non-negative matrix factorisation

Non-negative Matrix Factorisation (NMF) has been successfully applied for learning the meaning of a small set of vocal commands without any prior knowledge of the language. This kind of learning is useful if flexibility in terms of the acoustic and language model is required, for example in assistive technologies for dysarthric speakers because they do not comply with common models. Vocal comma...


Adaptation of deep neural network acoustic models using factorised i-vectors

The use of deep neural networks (DNNs) in a hybrid configuration is becoming increasingly popular and successful for speech recognition. One issue with these systems is how to efficiently adapt them to reflect an individual speaker or noise condition. Recently speaker i-vectors have been successfully used as an additional input feature for unsupervised speaker adaptation. In this work the use o...


HAC-models: a novel approach to continuous speech recognition

In this paper, a bottom-up, activation-based paradigm for continuous speech recognition is described. Speech is described by co-occurrence statistics of acoustic events over an analysis window of variable length, leading to a vectorial representation of high but fixed dimension called “Histogram of Acoustic Co-occurrence” (HAC). During training, recurring acoustic patterns are discovered and as...




Publication date: 2001